fix(markdown_parser): paragraph with trailing hard break absorbs following blank line #9931
4a162df to 6b0765f
Merging this PR will not alter performance
Walkthrough
This PR fixes the markdown parser's handling of hard line breaks (…)
Pre-merge checks: ✅ 4 passed
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/biome_markdown_parser/src/syntax/mod.rs (1)
1083-1099: ⚠️ Potential issue | 🟠 Major
Synchronise inline pre-scan with the new hard-break boundary rule.
Line 1097 now terminates on a bare `NEWLINE` after a hard break, but `inline_list_source_len` / `scan_newline_in_inline_list` don't track that state. This can overrun the emphasis context into the next paragraph and skew delimiter matching.
💡 Suggested fix sketch
```diff
 fn inline_list_source_len(p: &mut MarkdownParser) -> usize {
     let start: usize = p.cur_range().start().into();
     p.lookahead(|p| {
         let mut has_content = false;
+        let mut after_hard_break = false;
         loop {
             if p.at(T![EOF]) {
                 break;
             }
             if p.at(NEWLINE) {
+                if after_hard_break {
+                    break;
+                }
                 if scan_newline_in_inline_list(p, has_content) {
                     break;
                 }
+                after_hard_break = false;
                 continue;
             }
+            if after_hard_break
+                && p.at(MD_TEXTUAL_LITERAL)
+                && p.cur_text().chars().all(|c| c == ' ' || c == '\t')
+            {
+                p.bump(MD_TEXTUAL_LITERAL);
+                continue;
+            }
+
             if !p.cur_text().chars().all(|c| c == ' ' || c == '\t') {
                 has_content = true;
             }
+            after_hard_break = p.at(MD_HARD_LINE_LITERAL);
             p.bump(p.cur());
         }
         let end: usize = p.cur_range().start().into();
         end.saturating_sub(start)
     })
 }
```
Also applies to: 1145-1149
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/mod.rs` around lines 1083 - 1099, The inline pre-scan must respect the new hard-break boundary: modify inline_list_source_len and scan_newline_in_inline_list so they are aware of the after_hard_break condition (or accept a flag) and stop scanning when a bare NEWLINE follows a hard break, mirroring the loop in mod.rs that breaks on NEWLINE when after_hard_break is true; update calls to inline_list_source_len/scan_newline_in_inline_list from the parser loop (where after_hard_break is set) to pass the state and ensure emphasis delimiter matching does not continue past that boundary.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 71398d5b-4631-477c-804d-7493537f7a79
⛔ Files ignored due to path filters (2)
- `crates/biome_markdown_formatter/tests/specs/markdown/hard_line.md.snap` is excluded by `!**/*.snap` and included by `**`
- `crates/biome_markdown_parser/tests/md_test_suite/ok/hard_line_break_paragraph_split.md.snap` is excluded by `!**/*.snap` and included by `**`
📒 Files selected for processing (3)
- `crates/biome_markdown_parser/src/syntax/mod.rs`
- `crates/biome_markdown_parser/tests/md_test_suite/ok/hard_line_break_paragraph_split.html`
- `crates/biome_markdown_parser/tests/md_test_suite/ok/hard_line_break_paragraph_split.md`
Note
This PR was created with AI assistance (Claude Code).
Summary
Fixes #9857.
When a paragraph ends with a hard line break (`\n`) followed by a blank line and another paragraph, the parser incorrectly merged both paragraphs into a single `MD_PARAGRAPH` node. The root cause: `MD_HARD_LINE_LITERAL` already consumes the line-ending newline, so the following `NEWLINE` token is the blank-line separator, but the existing inline loop did not recognize it as a paragraph boundary.

The fix hoists `after_hard_break` state across loop iterations and breaks the paragraph when a bare `NEWLINE` follows a hard line break. Container continuations (blockquotes, list items) are unaffected because their continuation tokens (`>`, indent) appear before any `NEWLINE`.

Also simplifies the adjacent whitespace-trivia consumption by removing a redundant outer guard.
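The boundary rule described above can be illustrated with a small standalone model. This is only a sketch, not the code in `mod.rs`: the `Tok` enum and `first_paragraph_len` are hypothetical names invented here, and the token stream is a simplification in which `HardLine` stands in for `MD_HARD_LINE_LITERAL` (it already includes its line ending) and `Newline` stands in for `NEWLINE`. It also assumes whitespace after a hard break leaves the state set, as in the review's suggested sketch.

```rust
/// Hypothetical, simplified token kinds (not Biome's real syntax kinds).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tok {
    Text,
    Whitespace,
    /// Stands in for MD_HARD_LINE_LITERAL: the hard break plus its own line ending.
    HardLine,
    /// Stands in for a bare NEWLINE token.
    Newline,
}

/// Returns how many leading tokens belong to the first paragraph.
fn first_paragraph_len(tokens: &[Tok]) -> usize {
    // State carried across iterations, mirroring the hoisted `after_hard_break`.
    let mut after_hard_break = false;
    for (i, tok) in tokens.iter().enumerate() {
        match tok {
            Tok::HardLine => after_hard_break = true,
            // Assumption: whitespace-only tokens do not reset the state.
            Tok::Whitespace => {}
            Tok::Newline => {
                // A bare newline right after a hard break is the blank-line
                // separator; so is a newline followed by another newline.
                let blank_line = tokens.get(i + 1) == Some(&Tok::Newline);
                if after_hard_break || blank_line {
                    return i;
                }
            }
            Tok::Text => after_hard_break = false,
        }
    }
    tokens.len()
}
```

In this model, `[Text, HardLine, Newline, Text]` ends the paragraph after two tokens, while `[Text, HardLine, Text]` keeps all three in one paragraph, because no bare newline follows the hard break.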
Test Plan
- `just test-crate biome_markdown_parser`
- `just test-markdown-conformance`
- `just test-crate biome_markdown_formatter`
- `just lint`

Docs
N/A